šŸ“¢ 13 Critical Questions About LLMs – Seeking Insight and Collaboration

Hi all,

I’ve been reflecting on the current architecture and limitations of large language models (LLMs) and would love your thoughts on some fundamental questions. These aren’t meant as criticisms, but as honest questions intended to provoke discussion and collaboration. I’m especially interested in forming a small group to dive deeper into these ideas. Here are the 13 issues I see:


šŸ” Fundamental Questions:

  1. Why don’t LLMs truly understand the consequences of their outputs?
    Their actions are detached from any notion of ā€œcostā€ or ā€œpenalty,ā€ unlike how humans or even natural evolution works.

  2. Why are current models optimized purely for correct answers, ignoring incorrect ones?
    This leads to an incomplete learning cycle that doesn’t encode the ā€œpain of being wrong.ā€

  3. Why is reinforcement learning designed only around rewards and not meaningful penalties?
    In nature, both reward and punishment drive learning.

  4. Why are all outputs treated equally, without tracking the ā€œpriceā€ paid for each decision?
    Shouldn’t each output have an associated energy cost, just like in physical systems?

  5. Can a machine learn ā€œregretā€ or understand ā€œfailureā€?
    Current systems don’t appear to build up internal warnings or self-protection mechanisms after failure.

  6. Why does symbolic reasoning remain disconnected from statistical models?
    Can we design a hybrid that respects both deductive logic and probability?

  7. Why is generalization in LLMs treated as intelligence when it often leads to hallucination?
    The current metric of ā€œbeing able to generate everythingā€ seems flawed.

  8. Why are models trained on correctness, but then evaluated based on fluency and coherence?
    This mismatch encourages models to sound right, not be right.

  9. Why do we lack mechanisms to penalize mass hallucination in generated outputs?
    Without real consequences, models never refine their error understanding.

  10. Why does long-context reasoning still fail under real-world constraints?
    Even when models have memory, they don’t really plan like humans.

  11. Why does training for the right answer often make models perform worse in unknown situations?
    Overfitting to ā€œtruthā€ may harm adaptability.

  12. Why do context limitations block models from applying knowledge when it matters most (in action)?

  13. Why are so-called enhancement techniques (e.g., chain-of-thought, hallucination harnessing) still fundamentally in the wrong direction?
    They seem to layer more ā€œintelligent-sounding behaviorā€ without solving the underlying architectural flaws.


šŸ’” Personal Thought:

I believe we need a different foundation—possibly one that integrates symbolic negative rules, irreversible energy loss for each decision, and a true consequence-based learning loop. I’m not a programmer, but I’ve thought deeply about the architecture, and I’d love to form a group or channel where we can brainstorm how to actually implement this or co-develop experiments.

If anyone sees alignment with their research or curiosity, please reach out or comment. Maybe we can build a better system—one decision cost at a time.


I’m going to try my best here…

1/2) Their actions are detached from any notion of ā€œcostā€ or ā€œpenalty,ā€ unlike how humans or even natural evolution works. Models are trained to maximize the probability of the next token, and in reinforcement learning, to maximize a reward function. In RL, the loss is shaped to increase the reward, not to apply penalties for bad output. In general, models never internalize a cost function that punishes mistakes in a graded, systemic way. If a model hallucinates, its parameters receive no corrective gradient unless a human explicitly labels that output as incorrect during fine-tuning.
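To make the distinction concrete, here is a minimal sketch (in PyTorch, with hypothetical `reward_fn` / `penalty_fn` scoring functions) of what a training signal looks like when an explicit penalty term is subtracted from the reward, rather than relying on reward alone:

```python
import torch

def cost_aware_loss(logprobs, outputs, reward_fn, penalty_fn, lam=0.5):
    """Sketch only: combine a reward with an explicit penalty for bad outputs.

    logprobs   -- log-probabilities the policy assigned to its sampled outputs
    reward_fn  -- scores how good an output is (e.g. a preference model)
    penalty_fn -- scores how wrong/costly an output is (e.g. a hallucination check)
    lam        -- weight of the penalty relative to the reward
    """
    rewards = torch.tensor([reward_fn(o) for o in outputs])
    penalties = torch.tensor([penalty_fn(o) for o in outputs])

    # A reward-only setup effectively uses just `rewards`; the point here is
    # the explicit `- lam * penalties` term that makes bad outputs cost something.
    advantage = rewards - lam * penalties
    return -(logprobs * advantage).mean()   # REINFORCE-style surrogate loss
```

Whether `penalty_fn` can be defined well is exactly the open problem in 3/5 below.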

3/5) Why is reinforcement learning designed only around rewards and not meaningful penalties? We don’t have a reliable way to measure how bad an output actually is, so there is nothing we can track and score in a principled way.

11) It depends on the domain. You may need to overfit to pick up on patterns, as in math, but yes, overfitting is generally bad.

Thanks again for your reply. I’d like to clarify and expand on my original question, because I realize I may have made it sound too abstract.

In current AI training setups, especially reinforcement learning, the model is typically driven by maximizing rewards—while incorrect outputs may get less reward or no reward at all. But in nature, or in human behavior, every action—right or wrong—comes with a cost. For example, even if I make the right decision, I still lose time, energy, or some resource. And if I make the wrong decision, the cost is even higher.

This ā€œnatural costā€ is not something explicitly labeled by a human. It exists independent of success or failure, like a kind of built-in entropy or energy consumption. That’s what seems missing in current models.

So here’s my more grounded question:

Is it possible to design an AI model where every decision—correct or incorrect—incurs a small cost, simulating the natural ā€˜resource burn’ of existing in the world? And on top of that, could we allow both rewards and penalties to emerge more from the environment or the task context, rather than from hand-labeled signals?

I think this kind of structure might make models act more cautiously, reason more realistically, and avoid blindly maximizing reward at all cost—because they now have something to lose, always.
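If anyone wants to prototype this, a thin environment wrapper may be the cheapest way to test it: subtract a small cost from every step’s reward, and a larger one when the environment itself signals failure, so neither cost depends on hand-labeling. A toy sketch against the Gymnasium interface (the cost values and the `"failure"` info key are my own assumptions):

```python
import gymnasium as gym

class ResourceBurnWrapper(gym.Wrapper):
    """Toy sketch: every decision burns resources; failures burn more."""

    def __init__(self, env, step_cost=0.01, failure_cost=0.1):
        super().__init__(env)
        self.step_cost = step_cost        # paid on every action, right or wrong
        self.failure_cost = failure_cost  # extra burn when the task itself signals failure

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        reward -= self.step_cost                   # unavoidable "entropy" of acting
        if info.get("failure", False):             # hypothetical signal from the environment
            reward -= self.failure_cost            # penalty emerges from the task, not a labeler
        return obs, reward, terminated, truncated, info

# Usage: env = ResourceBurnWrapper(gym.make("CartPole-v1"))
```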

Would love to know what others think.


I believe so; you would just need more systems in play, for example an LLM judge, a verifier model, things of that nature. You would have to train on hallucinations and then penalize the generator. If you don’t want to do that, you could generate ā€œconsistencyā€ prompts and punish fallacies or correct answers that aren’t consistent with each other.
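A rough sketch of that judge-in-the-loop idea, where a separate verifier scores each candidate and low scores turn into penalties on the generator’s reward (the `generate`/`score` interfaces and the threshold are stand-ins, not any particular library):

```python
def judged_reward(prompt, generator, judge, base_reward_fn,
                  hallucination_penalty=1.0, threshold=0.5):
    """Sketch: a verifier/judge model converts suspected hallucinations into penalties."""
    answer = generator.generate(prompt)       # candidate output from the main model
    support = judge.score(prompt, answer)     # judge's estimate that the answer is grounded
    reward = base_reward_fn(prompt, answer)   # ordinary task reward (helpfulness, preference, ...)
    if support < threshold:                   # judge flags a likely hallucination
        reward -= hallucination_penalty       # the generator pays for it
    return answer, reward
```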

For the idea of ā€œRegretā€:
I would imagine you could do a persistent memory buffer that just stores all the states where the predictions were so horrendous that they get stored away and rechecked every prompt so that the model avoids that state
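Something like this, as a minimal sketch (what counts as ā€œhorrendousā€ and how you measure similarity between states are left open):

```python
from dataclasses import dataclass, field

@dataclass
class RegretBuffer:
    """Toy sketch: remember states where predictions went badly and consult them later."""
    threshold: float = -5.0                  # score below which a failure gets stored
    entries: list = field(default_factory=list)

    def record(self, state, output, score):
        if score < self.threshold:           # only the truly bad cases are kept
            self.entries.append((state, output, score))

    def warn(self, state, similarity_fn, min_sim=0.8):
        # Rechecked on every prompt: return past failures that resemble the current state.
        return [(s, o) for s, o, _ in self.entries if similarity_fn(state, s) >= min_sim]
```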


Thank you so much!
Your questions are exactly what I’ve been thinking about myself.

For now, I imagine these small models as ā€œveto agentsā€ — each trained to reject specific categories of error (like logical fallacies, broken grammar, inconsistent units in physics/math, etc.).
If any of them raises a red flag, the main model is forced to regenerate. Over time, this pressure could help the main model internalize the avoidance of such mistakes.
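Roughly, the loop I have in mind looks like this (shown as a Python sketch; the agent interface, the retry limit, and feeding objections back into the prompt are all assumptions on my side):

```python
def generate_with_vetoes(prompt, main_model, veto_agents, max_retries=5):
    """Sketch: regenerate until no veto agent raises a red flag."""
    draft = main_model.generate(prompt)
    for _ in range(max_retries):
        objections = [name for name, agent in veto_agents.items()
                      if agent.rejects(draft)]        # e.g. fallacy, grammar, unit checkers
        if not objections:
            return draft                              # every veto agent is satisfied
        # Feed the objections back so the next attempt can avoid them.
        draft = main_model.generate(
            f"{prompt}\n\nAvoid these issues: {', '.join(objections)}")
    return draft                                      # give up after max_retries
```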

As for training: yes, my current idea is to use supervised learning based on curated sets of labeled mistakes.

I’ll definitely look into Tree of Thoughts and Directional Decoding — thanks for the pointers!

If you’re interested, I’d love to brainstorm more with you, or maybe invite you to a group when I set one up.
